121 research outputs found

    Ranking relations using analogies in biological and information networks

    Analogical reasoning depends fundamentally on the ability to learn and generalize about relations between objects. We develop an approach to relational learning which, given a set of pairs of objects $\mathbf{S}=\{A^{(1)}:B^{(1)},A^{(2)}:B^{(2)},\ldots,A^{(N)}:B^{(N)}\}$, measures how well other pairs $A:B$ fit in with the set $\mathbf{S}$. Our work addresses the following question: is the relation between objects $A$ and $B$ analogous to those relations found in $\mathbf{S}$? Such questions are particularly relevant in information retrieval, where an investigator might want to search for analogous pairs of objects that match the query set of interest. There are many ways in which objects can be related, making the task of measuring analogies very challenging. Our approach combines a similarity measure on function spaces with Bayesian analysis to produce a ranking. It requires data containing features of the objects of interest and a link matrix specifying which relationships exist; no further attributes of such relationships are necessary. We illustrate the potential of our method on text analysis and information networks. An application on discovering functional interactions between pairs of proteins is discussed in detail, where we show that our approach can work in practice even if a small set of protein pairs is provided. Comment: Published in at http://dx.doi.org/10.1214/09-AOAS321 the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org

    Taste or Addiction?: Using Play Logs to Infer Song Selection Motivation

    Online music services are increasing in popularity. They enable us to analyze people's music listening behavior based on play logs. Although it is known that people listen to music based on topic (e.g., rock or jazz), we assume that when a user is addicted to an artist, s/he chooses the artist's songs regardless of topic. Based on this assumption, in this paper, we propose a probabilistic model to analyze people's music listening behavior. Our main contributions are three-fold. First, to the best of our knowledge, this is the first study modeling music listening behavior by taking into account the influence of addiction to artists. Second, by using real-world datasets of play logs, we showed the effectiveness of our proposed model. Third, we carried out qualitative experiments and showed that taking addiction into account enables us to analyze music listening behavior from a new viewpoint in terms of how people listen to music according to the time of day, how an artist's songs are listened to by people, etc. We also discuss the possibility of applying the analysis results to applications such as artist similarity computation and song recommendation. Comment: Accepted by The 21st Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD 2017

    Refining cellular pathway models using an ensemble of heterogeneous data sources

    © Institute of Mathematical Statistics, 2018. Improving current models and hypotheses of cellular pathways is one of the major challenges of systems biology and functional genomics. There is a need for methods to build on established expert knowledge and reconcile it with results of new high-throughput studies. Moreover, the available sources of data are heterogeneous, and the data need to be integrated in different ways depending on which part of the pathway they are most informative for. In this paper, we introduce a compartment-specific strategy to integrate edge, node and path data for refining a given network hypothesis. To carry out inference, we use a local-move Gibbs sampler for updating the pathway hypothesis from a compendium of heterogeneous data sources, and a new network regression idea for integrating protein attributes. We demonstrate the utility of this approach in a case study of the pheromone response MAPK pathway in the yeast S. cerevisiae. This work was supported, in part, by NIH grant R01 GM-096193, NSF CAREER grant IIS-1149662, and by MURI award W911NF-11-1-0036 to Harvard University. EMA is an Alfred P. Sloan Research Fellow and a Shutzer Fellow at the Radcliffe Institute for Advanced Studies. FM acknowledges support from the University of Cambridge, Cancer Research UK (C14303/A17197), and Hutchison Whampoa Limited. FM and EMA contributed equally to this work

    The interplay of microscopic and mesoscopic structure in complex networks

    Not all nodes in a network are created equal. Differences and similarities exist at both individual node and group levels. Disentangling single-node from group properties is crucial for network modeling and structural inference. Based on unbiased generative probabilistic exponential random graph models and employing distributive message passing techniques, we present an efficient algorithm that allows one to separate the contributions of individual nodes and groups of nodes to the network structure. This leads to improved detection accuracy of latent class structure in real-world data sets compared to models that focus on group structure alone. Furthermore, the inclusion of hitherto neglected group-specific effects in models used to assess the statistical significance of small subgraph (motif) distributions in networks may be sufficient to explain most of the observed statistics. We show the predictive power of such generative models in forecasting putative gene-disease associations in the Online Mendelian Inheritance in Man (OMIM) database. The approach is suitable for both directed and undirected unipartite as well as for bipartite networks

    Mapping Dynamic Histone Acetylation Patterns to Gene Expression in Nanog-depleted Murine Embryonic Stem Cells

    Embryonic stem cells (ESC) have the potential to self-renew indefinitely and to differentiate into any of the three germ layers. The molecular mechanisms for self-renewal, maintenance of pluripotency and lineage specification are poorly understood, but recent results point to a key role for epigenetic mechanisms. In this study, we focus on quantifying the impact of histone 3 acetylation (H3K9,14ac) on gene expression in murine embryonic stem cells. We analyze genome-wide histone acetylation patterns and gene expression profiles measured over the first five days of cell differentiation triggered by silencing Nanog, a key transcription factor in ESC regulation. We explore the temporal and spatial dynamics of histone acetylation data and its correlation with gene expression using supervised and unsupervised statistical models. On a genome-wide scale, changes in acetylation are significantly correlated with changes in mRNA expression and, surprisingly, this coherence increases over time. We quantify the predictive power of histone acetylation for gene expression changes in a balanced cross-validation procedure. In an in-depth study we focus on genes central to the regulatory network of mouse ESC, including those identified in a recent genome-wide RNAi screen and in the PluriNet, a computationally derived stem cell signature. We find that compared to the rest of the genome, ESC-specific genes show significantly more acetylation signal and a much stronger decrease in acetylation over time, which is often not reflected in a concordant expression change. These results shed light on the complexity of the relationship between histone acetylation and gene expression and are a step forward in dissecting the multilayer regulatory mechanisms that determine stem cell fate. Comment: accepted at PLoS Computational Biolog

    Design options, Implementation Issues and Evaluating Success of Ecologically Engineered Shorelines

    Human population growth and accelerating coastal development have been the drivers for unprecedented construction of artificial structures along shorelines globally. Construction has been recently amplified by societal responses to reduce flood and erosion risks from rising sea levels and more extreme storms resulting from climate change. Such structures, leading to highly modified shorelines, deliver societal benefits, but they also create significant socioeconomic and environmental challenges. The planning, design and deployment of these coastal structures should aim to provide multiple goals through the application of ecoengineering to shoreline development. Such developments should be designed and built with the overarching objective of reducing negative impacts on nature, using hard, soft and hybrid ecological engineering approaches. The design of ecologically sensitive shorelines should be context-dependent and combine engineering, environmental and socioeconomic considerations. The costs and benefits of ecoengineered shoreline design options should be considered across all three of these disciplinary domains when setting objectives, informing plans for their subsequent maintenance and management and ultimately monitoring and evaluating their success. To date, successful ecoengineered shoreline projects have engaged with multiple stakeholders (e.g. architects, engineers, ecologists, coastal/port managers and the general public) during their conception and construction, but few have evaluated engineering, ecological and socioeconomic outcomes in a comprehensive manner. 
    Increasing global awareness of climate change impacts (increased frequency or magnitude of extreme weather events and sea level rise), coupled with future predictions for coastal development (due to population growth leading to urban development and renewal, land reclamation and establishment of renewable energy infrastructure in the sea) will increase the demand for adaptive techniques to protect coastlines. In this review, we present an overview of current ecoengineered shoreline design options, the drivers and constraints that influence implementation and factors to consider when evaluating the success of such ecologically engineered shoreline

    Guidelines for the use and interpretation of assays for monitoring autophagy (2nd edition)

    In 2008 we published the first set of guidelines for standardizing research in autophagy. Since then, research on this topic has continued to accelerate, and many new scientists have entered the field. Our knowledge base and relevant new technologies have also been expanding. Accordingly, it is important to update these guidelines for monitoring autophagy in different organisms. Various reviews have described the range of assays that have been used for this purpose. Nevertheless, there continues to be confusion regarding acceptable methods to measure autophagy, especially in multicellular eukaryotes..

    Combined node and link partitions method for finding overlapping communities in complex networks

    Community detection in complex networks is a fundamental data analysis task in various domains, and how to effectively find overlapping communities in real applications is still a challenge. In this work, we propose a new unified model and method for finding the best overlapping communities on the basis of the associated node and link partitions derived from the same framework. Specifically, we first describe a unified model that accommodates node and link communities (partitions) together, and then present a nonnegative matrix factorization method to learn the parameters of the model. Thereafter, we infer the overlapping communities based on the derived node and link communities, i.e., we determine each overlapping community between the corresponding node and link community with a greedy optimization of a local community function, conductance. Finally, we introduce a model selection method based on consensus clustering to determine the number of communities. We have evaluated our method on both synthetic and real-world networks with ground-truths, and compared it with seven state-of-the-art methods. The experimental results demonstrate the superior performance of our method over the competing ones in detecting overlapping communities for all analysed data sets. Improved performance is particularly pronounced in cases of more complicated networked community structures

    Network-based models for social recommender systems

    With the overwhelming number of online products available in recent years, there is an increasing need to filter and deliver relevant personalized advice for users. Recommender systems solve this problem by modeling and predicting individual preferences for a great variety of items such as movies, books or research articles. In this chapter, we explore rigorous network-based models that outperform leading approaches for recommendation. The network models we consider are based on the explicit assumption that there are groups of individuals and of items, and that the preferences of an individual for an item are determined only by their group memberships. The accurate prediction of individual user preferences over items can be accomplished by different methodologies, such as Monte Carlo sampling or Expectation-Maximization methods, the latter resulting in a scalable algorithm which is suitable for large datasets

    Collective Dynamics of Gene Expression in Cell Populations

    The phenotypic state of the cell is commonly thought to be determined by the set of expressed genes. However, given the apparent complexity of genetic networks, it remains open what processes stabilize a particular phenotypic state. Moreover, it is not clear how unique the mapping between the vector of expressed genes and the cell's phenotypic state is. To gain insight into these issues, we study here the expression dynamics of metabolically essential genes in twin cell populations. We show that two yeast cell populations derived from a single steady-state mother population and exhibiting a similar growth phenotype in response to an environmental challenge displayed diverse expression patterns of essential genes. The observed diversity in the mean expression between populations could not result from stochastic cell-to-cell variability, which would be averaged out in our large cell populations. Remarkably, within a population, sets of expressed genes exhibited coherent dynamics over many generations. Thus, the emerging gene expression patterns resulted from collective population dynamics. This suggests that in a wide range of biological contexts, gene expression reflects a self-organization process coupled to population-environment dynamics